Next #150
Conversation
If a user switches from a test file to an implementation file, it is useful that 'Test Current File' remembers and executes the last test file actually run. Changes:
- We now actually match the filename against `settings['pattern']`; not all files with a '.py' suffix are test files.
- We store the last used test file in `window.settings()`, which is cheap to implement and probably the right thing.
# Conflicts: # unittesting/core/st3/runner.py
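Roughly, the remembering logic described above could be sketched like this (a toy model: `remember_last_test_file`, the glob-style `pattern` default, and the plain dict standing in for `window.settings()` are illustrative, not UnitTesting's actual implementation):

```python
import fnmatch
import os

def remember_last_test_file(window_settings, file_name, pattern="test*.py"):
    """Remember file_name as the last test file only if it matches the
    test pattern; always return the last remembered test file."""
    if fnmatch.fnmatch(os.path.basename(file_name), pattern):
        window_settings['last_test_file'] = file_name
    return window_settings.get('last_test_file')

settings = {}  # stands in for window.settings()
remember_last_test_file(settings, "/pkg/tests/test_runner.py")
# Switching to an implementation file does not overwrite the remembered file:
print(remember_last_test_file(settings, "/pkg/runner.py"))
```

The point of matching against the pattern first is exactly the commit note above: not every '.py' file is a test file.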
We could add this to the

Thanks. This looks more promising.
We now have three options: 'deferred', 'async', and 'fast_timings'. I think we can deprecate all of them. 'deferred=False' is slower than the current 'deferred=True' and has really limited functionality. (Since these are global switches, it is unlikely that a Sublime plugin never wants to wait for Sublime to do something.) 'async=True' is not the way forward either. I can imagine that we run the framework on a separate thread, but again the feature set of the deferred TestCase is superior. But running the code under test on a 'third' thread which then delegates to Sublime's threads ... I don't think that makes a lot of sense. As a tester you usually don't want more async, you want less.
unittesting/core/st3/fast_runner.py (Outdated)

    run_on_worker = sublime.set_timeout_async

    class FastDeferringTextTestRunner(TextTestRunner):
How does it differ from DeferringTextTestRunner except for DEFAULT_CONDITION_POLL_TIME?
It seems that they are almost identical.
You need to diff them, really:

diff --git a/unittesting/core/st3/runner.py b/unittesting/core/st3/fast_runner.py
index dff1fbf..3785aa8 100644
--- a/unittesting/core/st3/runner.py
+++ b/unittesting/core/st3/fast_runner.py
@@ -10,13 +10,15 @@ def defer(delay, callback, *args, **kwargs):
sublime.set_timeout(partial(callback, *args, **kwargs), delay)
+DEFAULT_CONDITION_POLL_TIME = 17
+DEFAULT_CONDITION_TIMEOUT = 4000
AWAIT_WORKER = 'AWAIT_WORKER'
# Extract `set_timeout_async`, t.i. *avoid* late binding, in case a user
# patches it
run_on_worker = sublime.set_timeout_async
-class DeferringTextTestRunner(TextTestRunner):
+class FastDeferringTextTestRunner(TextTestRunner):
"""This test runner runs tests in deferred slices."""
def run(self, test):
@@ -48,7 +50,7 @@ class DeferringTextTestRunner(TextTestRunner):
startTestRun()
try:
deferred = test(result)
- defer(10, _continue_testing, deferred)
+ _continue_testing(deferred)
except Exception as e:
_handle_error(e)
@@ -61,10 +63,10 @@ class DeferringTextTestRunner(TextTestRunner):
condition = deferred.send(send_value)
if callable(condition):
- defer(100, _wait_condition, deferred, condition)
+ defer(0, _wait_condition, deferred, condition)
elif isinstance(condition, dict) and "condition" in condition and \
callable(condition["condition"]):
- period = condition.get("period", 100)
+ period = condition.get("period", DEFAULT_CONDITION_POLL_TIME)
defer(period, _wait_condition, deferred, **condition)
elif isinstance(condition, int):
defer(condition, _continue_testing, deferred)
@@ -73,7 +75,7 @@ class DeferringTextTestRunner(TextTestRunner):
partial(defer, 0, _continue_testing, deferred)
)
else:
- defer(10, _continue_testing, deferred)
+ defer(0, _continue_testing, deferred)
except StopIteration:
_stop_testing()
@@ -82,21 +84,29 @@ class DeferringTextTestRunner(TextTestRunner):
except Exception as e:
_handle_error(e)
- def _wait_condition(deferred, condition, period=100, timeout=10000, start_time=None):
+ def _wait_condition(
+ deferred, condition,
+ period=DEFAULT_CONDITION_POLL_TIME,
+ timeout=DEFAULT_CONDITION_TIMEOUT,
+ start_time=None
+ ):
if start_time is None:
start_time = time.time()
try:
send_value = condition()
except Exception as e:
- defer(10, _continue_testing, deferred, throw_value=e)
+ _continue_testing(deferred, throw_value=e)
return
if send_value:
- defer(10, _continue_testing, deferred, send_value=send_value)
+ _continue_testing(deferred, send_value=send_value)
elif (time.time() - start_time) * 1000 >= timeout:
- self.stream.writeln("Condition timeout, continue anyway.")
- defer(10, _continue_testing, deferred)
+ error = TimeoutError(
+ 'Condition not fulfilled within {:.2f} seconds'
+ .format(timeout / 1000)
+ )
+ _continue_testing(deferred, throw_value=error)
else:
defer(period, _wait_condition, deferred, condition, period, timeout, start_time)
@@ -154,4 +164,4 @@ class DeferringTextTestRunner(TextTestRunner):
else:
self.stream.write("\n")
- sublime.set_timeout(_start_testing, 10)
+ _start_testing()
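To make the new timeout behavior in the diff above concrete, here is a blocking re-creation of the `_wait_condition` logic (a sketch: it polls with `time.sleep` instead of `sublime.set_timeout`, so the real runner's deferred, non-blocking flow is not represented):

```python
import time

DEFAULT_CONDITION_POLL_TIME = 17   # ms, ~60 Hz, as chosen in the PR
DEFAULT_CONDITION_TIMEOUT = 4000   # ms

def wait_condition(condition, period=DEFAULT_CONDITION_POLL_TIME,
                   timeout=DEFAULT_CONDITION_TIMEOUT):
    """Blocking stand-in for the deferred _wait_condition above."""
    start_time = time.time()
    while True:
        send_value = condition()
        if send_value:
            return send_value  # truthy value is sent back into the coroutine
        if (time.time() - start_time) * 1000 >= timeout:
            # New behavior: raise into the test instead of silently continuing
            raise TimeoutError(
                'Condition not fulfilled within {:.2f} seconds'
                .format(timeout / 1000)
            )
        time.sleep(period / 1000)

print(wait_condition(lambda: "ready"))      # condition truthy right away
try:
    wait_condition(lambda: False, timeout=50)
except TimeoutError as e:
    print(e)
```

The important change is the last branch: a condition that never becomes truthy now surfaces as a `TimeoutError` thrown into the test coroutine, rather than a log line plus "continue anyway".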
We have quite a few places where we just continue synchronously. And we use '0' often, which has different semantics because it will also continue immediately, but only after other immediate tasks have been run. ('0' will probably not yield back to Sublime's task scheduler but just append this task to the end of the current loop.)
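That ordering difference can be demonstrated with a tiny single-queue model (purely illustrative; Sublime's real scheduler is not this simple):

```python
from collections import deque

queue = deque()

def defer(delay, fn, *args):
    # Toy model: delay is ignored; 0 just appends to the current queue,
    # so fn runs only after tasks that were already queued.
    queue.append(lambda: fn(*args))

order = []
def task(name):
    order.append(name)

defer(0, task, "already-queued")
task("direct")               # direct call: runs before anything in the queue
defer(0, task, "deferred-zero")  # cooperative: runs after earlier queued tasks

while queue:                 # drain the queue once
    queue.popleft()()

print(order)
```

The direct call runs immediately, while `defer(0, ...)` lands behind whatever was already queued — which is exactly the "different semantics" of '0' mentioned above.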
You are copying

This whole PR looks good to me in general except that I am really reluctant to have
unittesting/core/st3/fast_runner.py (Outdated)

        defer(condition, _continue_testing, deferred)
    elif condition == AWAIT_WORKER:
        run_on_worker(
            partial(defer, 0, _continue_testing, deferred)
Once a TestCase yields to AWAIT_WORKER, the whole test case will be executed in the worker thread.
Should we send the coroutine back to the main thread like this?
    partial(defer, 0, lambda: sublime.set_timeout(_continue_testing), deferred)
I already do the double-hop here to get it back on the main thread.
This line evaluates to

    timeout_async(lambda: timeout(lambda: continue))
    ^^^ await worker      ^^^ hop back   ^^^ actually continue
but this quickly gets confusing, so we may need a test here
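A toy two-queue model of that double hop (the `deque`s stand in for Sublime's main and worker task queues; this is a sketch, not how Sublime schedules internally):

```python
from collections import deque

main_queue, worker_queue = deque(), deque()
set_timeout = main_queue.append          # stand-in for sublime.set_timeout
set_timeout_async = worker_queue.append  # stand-in for sublime.set_timeout_async

log = []
# Something is already pending on the worker:
set_timeout_async(lambda: log.append("pending async work"))
# The double hop: await the worker, then hop back to the main queue.
set_timeout_async(lambda: set_timeout(lambda: log.append("continue on main")))

while worker_queue:            # the worker loop drains first ...
    worker_queue.popleft()()
while main_queue:              # ... then control returns to the main loop
    main_queue.popleft()()

print(log)
```

The pending async work runs before the continuation, and the continuation itself ends up back on the main queue — the two properties the double hop is meant to guarantee.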
Thanks. I missed the point that defer is really set_timeout.
Subclassing is the wrong approach here. We don't want to maintain two or three versions of it. We actually ship the old version because Package Control doesn't do semantic versioning right. As I see it, usually you would tag it 2.0, and people would just stick with their 1.x until they're ready. But we don't have that in Sublime land. So we basically ship two versions. (That's the same as when IE10 (Internet Explorer) actually ships a version of IE8 and IE9.) If we subclass, we can never guarantee that we do NOT break the old version.
To be honest, backporting some features to the old Runner was maybe a complete waste of my time. |
Ok. I argue we should only keep one version of the runners. It could be done by

instead of

If you agree and don't want to get your hands dirty, I might be able to find some time this weekend to implement the above idea.
A plain

In the PR I sometimes change

Also, I think this explosion of settings confuses users. There is actually no continuum of 0..10ms yield time; 0 has different semantics. For the poll time, you want it low, but you probably don't want it to be faster than 60 Hz, so I took 17ms. I don't think users should experiment with different settings here. They should switch one setting and get what we have right now plus what we might have later in

Now, of course, I tried different things. But adding 10 if statements in one function doesn't cut it; you need different implementations via subclassing or functional voodoo. I experimented a bit here, both subclassing and functional composition, and in the end, to be honest, it felt like Operette. The code here is already difficult to reason about (and for less experienced readers even more so); sharing reusable bits makes it more complicated because you have more delegation and late binding. We usually oscillate between DRY and KISS, and here, with the code at hand, it is not simple enough to make it DRY. I hope you have a really good idea.
unittesting/core/st3/fast_runner.py (Outdated)

    try:
        send_value = condition()
    except Exception as e:
        _continue_testing(deferred, throw_value=e)
Please help me understand. Why can't we do defer(0, _continue_testing, deferred, throw_value=e)?
Oh, was it because of the coverage issue? I believe coverage would be ok with the sublime.set_timeout
Or is it because defer(0, ...) is slow?
Okay, you're trying to modify the behavior to make the generalization/abstraction easier. That's cheating.
Now, sure, both runners look similar, but the execution flow is completely different. For sync programs, we can look at conditional branching (e.g. ifs) or method delegation to get the 'flow'. But this is async code; all that matters is where we run and when. The code we're looking at is an 'executor' for a (test) program.
Here I want fn() behavior, which is sync and blocking. defer(0, fn) is not sync but cooperative (in the sense that other tasks in the queue can run before fn), yet still blocking (in the sense that all these tasks marked as to-be-run-immediately run in a chain without gaps). defer(10, fn) is neither sync nor blocking.
Please look at the DeferrableTestSuite in suite.py. We should remove all plain yields in run(). What is your preferred way of writing this?
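The three flavors can be sketched with a toy event loop (illustrative only; `ToyLoop` is not UnitTesting code, and real wall-clock delays are replaced by simulated timestamps):

```python
import heapq

class ToyLoop:
    """Minimal delay-ordered scheduler to contrast the three call styles."""
    def __init__(self):
        self.now = 0
        self.tasks = []   # heap of (due_time, seq, fn); seq breaks ties
        self.seq = 0

    def defer(self, delay, fn, *args):
        heapq.heappush(self.tasks, (self.now + delay, self.seq,
                                    lambda: fn(*args)))
        self.seq += 1

    def run(self):
        while self.tasks:
            due, _, fn = heapq.heappop(self.tasks)
            self.now = max(self.now, due)   # simulated clock advances
            fn()

loop = ToyLoop()
order = []
loop.defer(10, lambda: order.append(('delayed', loop.now)))      # not blocking
loop.defer(0, lambda: order.append(('cooperative', loop.now)))   # same "tick"
order.append(('direct', loop.now))   # fn(): sync and blocking, runs right now
loop.run()
print(order)
```

The direct call runs first at time 0; the `defer(0, ...)` task also runs at time 0 but only once the queue is drained; the `defer(10, ...)` task runs in a later tick of the simulated clock.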
It might come down to our different expectations about synchronization of the test cases.
The major difference between the original runner and the new runner is whether we run the next test function/test case in a deferred manner. It is also related to the plain yields in DeferrableTestSuite.
@@ -48,7 +50,7 @@ class DeferringTextTestRunner(TextTestRunner):
startTestRun()
try:
deferred = test(result)
- defer(10, _continue_testing, deferred)
+ _continue_testing(deferred)

When I first wrote it, I thought it would be just easier to defer everywhere. I have to admit that it may be neither very efficient nor strictly necessary. The idea here is to make sure the UI has time to respond and the tests won't block the UI. The name of the runner also suggests that it runs tests in a deferring manner. It is not guaranteed that the code will be synchronous.
I understand your point about keeping the code clean and simple. But if we look at the diff of the original runner and the fast runner, there are actually not many lines which are different. It is hard to imagine why we should not refactor the runners to make them more customizable. The real issue here is whether we should defer between tests. The setting fast_timings actually does not explain explicitly what is being made faster. Perhaps we need a more specific setting to control whether the coroutine is executed asynchronously and by how much. I am still leaning a bit towards defer_timeout, with the understanding that defer_timeout = 0 means fn() rather than timeout(fn, 0) in some of the lines.
The actual operation is hidden from the users. For the users, that number just means the number of milliseconds to wait; 0 just means "do not wait".
If you are worried about breakage, you have already protected yourself by writing the corresponding tests.
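As a sketch of that proposed `defer_timeout = 0` semantics (an assumption about the suggestion above, not shipped code; `make_defer` and the recording stand-ins are invented for the example):

```python
def make_defer(set_timeout):
    """Build a defer() where delay 0 means a direct, synchronous call and
    any other delay schedules via the given timeout function."""
    def defer(delay, fn, *args, **kwargs):
        if delay == 0:
            fn(*args, **kwargs)                       # "do not wait": inline
        else:
            set_timeout(lambda: fn(*args, **kwargs), delay)
    return defer

scheduled = []
# Stand-in for sublime.set_timeout that just records what would be queued:
defer = make_defer(lambda fn, delay: scheduled.append((delay, fn)))

calls = []
defer(0, calls.append, 'ran inline')   # executes immediately
defer(10, calls.append, 'queued')      # recorded for later execution

print(calls)           # only the inline call has run so far
print(len(scheduled))  # one task queued with delay 10
```

Under this reading, users get one knob: 0 collapses to a plain function call, anything else goes through the scheduler.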
unittesting/core/st3/fast_runner.py (Outdated)

                'Condition not fulfilled within {:.2f} seconds'
                .format(timeout / 1000)
            )
            _continue_testing(deferred, throw_value=error)
same here.
unittesting/core/st3/fast_runner.py (Outdated)

    else:
        self.stream.write("\n")

    _start_testing()
Can we do defer(0, _start_testing)?
unittesting/core/st3/fast_runner.py (Outdated)

    else:
        self.stream.write("\n")

    _start_testing()
If we could replace the function calls above by defer(....), that would be a strong reason to use the setting yield_timeout.
I'm not sure if we're discussing the features or the implementation here. Now, I think that this PR introduces a feature set that is superior and preferable to the old version. I expect all users to switch to the new version, so the implementation strategy is driven by that. It protects the old implementation so it never breaks, because we never touch it. The new 'runner' is basically as messy or cleanly written as the old one, and it is the only one we further improve. Saying that, we can rewrite the new runner in a clean way without the risk of breaking the old one in probably subtle ways. We can also change our mind and do the abstraction later, because applying an abstraction is easier than reverting one. It would be completely different if you thought that users make a reasonable (not lazy) choice for the old implementation because it is superior for UI testing and, for example, leads to fewer flaky tests; then we can surely implement both as 'strategies'. So that's the first point: I think the new implementation will replace the old one. The copy approach protects against breaking changes. Now, why this feature set? Which features? I must again remind everyone that I started the PR with the title 'deterministic testing'. The speed comes for free. Choosing the runner is actually an implementation detail, but the option a user can switch applies globally to the whole test suite or package. Current symptom: if you opt in for 'deferred', even 'unittest.TestCase's run with wait time. Now, a solid foundation for a test suite is lots of fine-grained 'pure' tests: tests that either don't interact with the UI at all, or where these side effects are actually mocked away. That's the case for 'test_user_args.py' from SublimeLinter. Here I as a user can expect maximum throughput. But I also want blocking behavior to ensure other plugins and Sublime don't see and interact with my mocks.
This could be done by optimizing for the usage of 'unittest.TestCase', but by writing these tests I found that I still want the option to do some 'yield's (e.g. in setUp), because a test might at least need a window or view. (Creating a view is fixture setup, not UI interaction testing. The test code uses getters on the view, not setters.) That's the second point: for the probably vast majority of 'pure' tests you want sync and blocking.
Do we need waiting time between tests? I do not tackle this here because that code is in suite.py. Depending on the test, we probably need waiting time between them. But suite uses plain 'yield's or double-'yield's, which correspond to ~20+ms on master. But 10 or 20ms are meaningless here. You (usually) want either no waiting time, 0 to process the main thread tasks, or 'AWAIT_WORKER' to await all the async events (basically to run the task queues empty). (Side note: Sublime does NOT generally paint within 10 or 20ms!)
Can we substitute

would be the same as

which of course doesn't make sense.
When and why do I use direct invocation? Most importantly, when the 'condition' resolves, whether by throwing, by timeout, or successfully. Going from A -> B we add waiting time, which is what the user intends anyway. Going from B -> C we flow directly. Between A and B the user expects that the world changes, but going from B to C we must do our best that the state of the world stays consistent. That's the idea here. What holds true in

Other than that I do it for

(That makes it difficult: I think 1. both calls should be direct invocations, as it is in my PR. 2. We shouldn't hop to the worker thread; instead package_reloader, which is right now a function that can only run on threads, should accept a continuation function and do the thread hopping by itself. It is a normal function, so it shouldn't care in what environment the user calls it. If it needs that environment, it's the responsibility of the function to prepare this environment. 3.
We shouldn't poll (on the main thread) for
TLDR, I am ok with the idea of replacing the old runner with your new runner. Even a little bit of breakage is okay. TBH, not many people are using the deferred option. If necessary, I could do a quick search on GitHub and remind every repo which uses UnitTesting.
Okay, just let it sink in; maybe I can come up with a DRYer version.
How about making the new runner the default, renaming the original runner LegacyDeferringTestRunner, and adding a setting
Basically, the old runner is now called LegacyDeferringTextTestRunner, and the former FastDeferringTextTestRunner is the new DeferrableTestCase.
Okay, enough testing. There is something in master that makes 'circleci: build-linux' flaky. (Or the combination of this PR plus master.) |
It looks all good to me now. Merged now. At some point, we might want to set the default value of
I will make a new release in a few days if everything runs smoothly |
Yeah, that's the plan. Write a good update message. Essentially we remove implicit waiting time, so users have to make their needed waiting time explicit. 'AWAIT_WORKER' can be used here. Also, users should almost always prefer

Is it that we're lucky that 'circleci' is green on master, or did you change some setting here?
I didn’t change any settings. Maybe you were being unlucky to catch some random glitches. |
Closes #146
Closes #147
Closes #148
Closes #149
Because in the end #146 and parts of #147 are breaking, we need to put them behind a feature flag. I used `fast_timings` here. The default is, for obvious reasons, 'False'. I recommend deprecating the old `DeferringTextTestRunner` because it has no advantage over the new one. I back-ported as many features as possible to the old one though.
All users get:
Users opting in to 'fast_timings' get:

`yield`. I highly recommend using this if the app/plugin you're building allows it.

Notable fix

`add`/`doCleanups` works

Why is the system now so fast?
We do not add implicit waiting time anymore.
What to do if the tests fail after opting in to 'fast_timings'?
Essentially, add explicit waiting time using one of the `yield` forms. E.g. for the failing GitSavvy tests (discussed in #146) it is enough to `yield AWAIT_WORKER` on `tearDown`. This is reasonable, and it doesn't make the tests as slow as before. The runtime of the GitSavvy test is still cut in half. (Note that AWAIT_WORKER is introduced by this PR; you usually used number guessing like `yield 300` et al. before.)
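To illustrate the `yield` protocol this PR builds on, here is a toy, blocking driver for a deferrable test body (the dispatch on AWAIT_WORKER and integers mirrors the runner's behavior described above, but the driver itself is only a sketch: it neither sleeps nor touches Sublime's queues):

```python
AWAIT_WORKER = 'AWAIT_WORKER'

def drive(deferred):
    """Toy driver: handles only the yield values used in this example."""
    handled = []
    try:
        condition = next(deferred)
        while True:
            if condition == AWAIT_WORKER:
                handled.append('awaited worker')   # real runner drains async
            elif isinstance(condition, int):
                handled.append('slept %d ms' % condition)
            condition = deferred.send(None)
    except StopIteration:
        pass
    return handled

def test_case():
    yield AWAIT_WORKER   # explicit: drain the async queue, e.g. in tearDown
    yield 300            # the old "number guessing" style, in milliseconds

print(drive(test_case()))
```

The recommendation above amounts to preferring the first yield form over the second: waiting for the queues to empty is deterministic, while a guessed number like 300 is either too long or too short.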